updated: 2022-01-23_12:32:31-05:00
Non-Context Free Languages
Pumping Lemma
Introduction and Givens
let G be a CFG for a CFL L
let b be the maximum number of terminals on the RHS of a rule
let w be the string derived by G
let v be the number of variables
w = w1, w2, w3...
S => ... => ... => ... => w
^ length (number of derivations) is k steps long
At every step in derivation, we can get at most b new terminals
Example
S > aTb
T > aU
U > bV
V > baV | e
S => aTb => aaUb => aabVb => aabbaVb => aabbab
Number of derivations is 5. 5 is more than 4 (variables)
one variable must loop
length of w for derivation of length k:
A derivation of length k contains
- How many variables if repetition is allowed?
- How many variables if repetition is not allowed?
If we pick k > |v| then at least two intermediate steps will contain a variable
i.e. A variable will be repeated in derivation
To ensure this repetition we choose w such that
-
- Choosing this guarantees a repetition of a variable in the derivation
$S \stackrel{}{\Rightarrow} uAz \stackrel{}{\Rightarrow} vAy\stackrel{\star}{\Rightarrow} w$
A is the variable that is repeating
S{
u
A{
v
A{
x
}
y
}
z...
}
If you have a grammar like this, we can get no repetitions, one repetitions, or many repetitions
What does this mean?
for k 0
if k = 0 -> uxz
if k = 1 -> uvxyz
if k = 2 -> uvxyz
We want to come up with a string that has has repeats
Theorem
- If L is a CFL, then there is a number P > 0 called pumping length such that if wL and |w|p, then w can be written as uvxyz where:
- |vxy| P
- vy e
- uvxyz L for all k 0
Proof:
Let L be a CFL and let G be the CFG that generates L
Let b be the maximum number of symbols on RHS of any rule
V is the number of variables
Let p = max(b+1,b)
Parse tree:
Number of steps is the same as the height of the parse tree
max length of string is b
consider the smallest possible parse tree for w
Subtree w:
Parse tree
because |w| b the height of the tree must be greater than |v|
Let be one of the longest paths in the tree, length of > |v|
this means contains repetition among the bottom |v| + 1 variables
then w = uvxyz is generated by repeating the portion of the tree that denotes "VAY"
parse tree can be constructed for every string of form uvxyz L for all k 0
height >= |V| implies repetition
Example: a^n b^n c^n
L = {}
Suppose L is CFL
let p be the pumping length provided by PL
consider and
clearly
According to PL, w can be represented as uvxyz where:
1. |vxy| P
2. vy e
3. uvxyz L for all k 0
vxy could be anywhere in the string. Can't say it's all a's or b's because of that. There are multiple cases
Two cases:
- v or y contain more than one kind of symbol (2 symbols, either a & b or b & c)
- let k = 2
-
- v or y have 2 characters, so repeating them means they are out of order when power 2 applies
- so if v x y was aa a ab => aaaa a abab
- now there is an a after a b...
- also, the number of b's and c's is different
- v & y each contain one type of symbol
- Options:
- v is all a's; y is all b's
- v is all b's; y is all c's
One of these cases must occur, a contradiction must occur
- v is all a's; y is all b's
Example: ww
L = {ww} (not a palindrome)
bla bla proof stuff
- |vxy| P
- vy e
- uvxyz L for all k 0
let w =
w consists of 4 blocks of length p
| vxy | p => vxy can can touch at most two blocks
Case 1
suppose that v and y are both contained within a single block (same symbol)
say block 1. ie: vxy is all 0's
let k = 2
=> for i > 0
if v & y are all 1's:
is also L
Now, for HW, look at multiple cases
Case 2
suppose v & y each have two different symbols
0's and 1's mixed up
because the 0's and 1's are not in order
therefore not in the language, and since neither is in language, not a CFL
...
Example: L = {a^i b^j c^k | 0 <= i <= j <= k}
Let L be a CFL and let p be the pumping length given by a pumping length
w = a b c
- we are choosing this because we know a^n b^n c^n not cfl, but a little different since we will have to pump up due to the less than
|w| p
therefore according to pumping lemma w can be represented as uvxyz where:
- |vxy| P
- vy e
- uvxyz L for all k 0
Case 1
Both v and y contain the same symbol
three subcases.. can't be generalized
- v & y are in the middle (all b's)
- b^p => vxy
- pumping down ...
- |a| > |b|
- soooo
- because |a| > |b|
- v and y are in the 1st part of the L (all a's)
- pumping down won't work, so we need to pump up
- Pumping up
- k = 2
- v and y are in the last part of L (all C's)
- pump down
- pump down
Case 2
both v and y contain different symbols
can be generalized:
- v is all a's and some b's ; y is all b's
- v is all b's and some c's; y is all c's
OR - v is some a's
- b is some a's and some b's
you can say "either v or y contain 2 different symbols" (thanks Nick!)
either way, the order is going to be wrong
- like if v = aaabbb, v^2 is aaabbbbaaabbb
- now a's are after b's
Pump up:
blabla, therefore not in set of CFL etc
Homework by Friday:
(from book)
2.30:
a) {0 1 0 1 | n 0} is not a CFL
d) {t # t # ... # t | k 2, each t {a,b} and t = t for some i j}
2.33:
show that f = {a b | i = kj for some positive integer k} is not a CFL
Example: L = {1^i # 1^j # 1^{i+j} | i <= j}
" CFL Pumping lemma stuff
choose w = 1 # 1 # 1
uvxyz
Case 1
either v or y contains #
- uvxyz => 1##1#1
- uvxyz => 11 # 1
^^ one is pumping up, one is down, we don't need to do both, just show one
She is reexplaining this case...
uvxyz => #1#1#1^p ... too many hashes
probably ignore this to not get confused
Case 2
neither v or y contain #
v & y could be first, second, or third block of the string (like between the #)
they will fall within a block
- first block (v & y are in first block)
- length of , v and y are in 1st block
- soo, we should pump up
- uvxyz L because i > j
- second block (v & y are in second block)
- pump down
- uvxyz L bc i>j
- third block (v & y are in third block)
- pump up
- uvxyz => 1#1#
- p+p
to be continued